6 research outputs found
Graph Aggregation Based Image Modeling and Indexing for Video Annotation
With the rapid growth of video multimedia databases and the lack of textual descriptions for many of them, video annotation has become a highly desired task. Conventional systems try to annotate a video query by simply finding its most similar videos in the database. Although the video annotation problem has been tackled over the last decade, no attention has been paid to assembling video keyframes in a sensible way to answer a given video query when no single candidate video turns out to be similar to the query. In this paper, we introduce a graph-based image modeling and indexing system for video annotation. Our system improves the video annotation task by assembling a set of graphs, each representing a keyframe of a different video, to compose the video query. The experimental results demonstrate the effectiveness of our system in annotating videos that cannot be annotated by classical approaches.
Multiscale fully convolutional denseNet for semantic segmentation
In the computer vision field, semantic segmentation is a task of great interest. Convolutional neural network (CNN) methods have shown strong performance compared with other semantic segmentation methods. In this paper, we propose a multiscale fully convolutional DenseNet approach for semantic segmentation. Our approach builds on the successful fully convolutional DenseNet method and reinforces it by integrating a multiscale kernel prediction after the last dense block, which performs model averaging over different spatial scales and gives the network more flexibility to capture more information. Experiments on two semantic segmentation benchmarks, CamVid and Cityscapes, demonstrate the effectiveness of our approach, which outperforms many recent works.
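The multiscale kernel prediction described above amounts to running parallel prediction branches with different kernel sizes over the final feature map and averaging their outputs. A minimal NumPy sketch of that idea, in which same-padded box filters stand in for learned convolutions (the function names, kernel sizes, and filter choice are illustrative assumptions, not the paper's implementation):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_filter(feature_map, k):
    """'Same'-padded k x k mean filter: a stand-in for a learned k x k conv."""
    pad = k // 2
    padded = np.pad(feature_map, pad, mode="edge")
    windows = sliding_window_view(padded, (k, k))
    return windows.mean(axis=(-1, -2))

def multiscale_prediction(feature_map, kernel_sizes=(1, 3, 5, 7)):
    """Run one prediction branch per kernel size over the final feature
    map and average the results (model averaging over spatial scales)."""
    preds = [box_filter(feature_map, k) for k in kernel_sizes]
    return np.mean(preds, axis=0)

fmap = np.random.default_rng(0).random((8, 8))
out = multiscale_prediction(fmap)
assert out.shape == fmap.shape  # spatial resolution is preserved
```

Because each branch uses same-padding, every scale produces a map of the query resolution, so the averaged prediction can be compared pixel-wise with the ground truth.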
Semantic Segmentation using Reinforced Fully Convolutional DenseNet with Multiscale Kernel
In recent years, semantic segmentation has become one of the most active tasks in the computer vision field. Its goal is to group image pixels into semantically meaningful regions. Deep learning methods, in particular those that use convolutional neural networks (CNNs), have achieved great success on the semantic segmentation task. In this paper, we introduce a semantic segmentation system using a reinforced fully convolutional DenseNet with a multiscale kernel prediction method. Our main contribution is an encoder-decoder architecture in which we increase the width of the dense blocks in the encoder by introducing recurrent connections inside each dense block. The resulting structure, called a wider dense block, lets each layer take not only the output of the previous layer but also the initial input of the dense block. This recurrent structure emulates the human brain and helps strengthen the extraction of the target features. As a result, our network becomes deeper and wider with no additional parameters, thanks to weight sharing. Moreover, a multiscale convolutional layer is added after the last dense block of the decoder to perform model averaging over different spatial scales and to provide a more flexible method. The proposed method has been evaluated on two semantic segmentation benchmarks, CamVid and Cityscapes, and outperforms many recent works from the state of the art.
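The wider dense block described above can be sketched as a dense block whose layers are unrolled recurrently with shared weights, where every step sees the block's initial input concatenated with the current features. A minimal NumPy sketch under stated assumptions (a linear map plus ReLU stands in for the usual BN-ReLU-Conv composite layer; all dimensions, names, and the fixed feature width are illustrative, not the paper's exact architecture):

```python
import numpy as np

def dense_layer(x, W):
    # Stand-in for a BN-ReLU-Conv composite layer.
    return np.maximum(0.0, x @ W)

def wider_dense_block(block_input, weights, steps=2):
    """Each layer is unrolled `steps` times with the SAME weight matrix
    (weight sharing, so the recurrence adds no parameters). Every step
    concatenates the block's initial input with the current features
    before applying the layer."""
    n, d = block_input.shape
    g = weights[0].shape[1]                  # feature (growth) width
    features = np.zeros((n, g))
    for W in weights:                        # layers of the dense block
        for _ in range(steps):               # recurrent unrolling, shared W
            x = np.concatenate([block_input, features], axis=1)
            features = dense_layer(x, W)     # W is reused at every step
    # DenseNet-style output: block input concatenated with learned features
    return np.concatenate([block_input, features], axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))             # 4 samples, 16 input channels
Ws = [rng.standard_normal((16 + 8, 8)) * 0.1 for _ in range(3)]  # 3 layers
y = wider_dense_block(x, Ws)
assert y.shape == (4, 16 + 8)
```

Setting `steps=1` recovers a plain feed-forward block; larger values deepen the block without adding parameters, which is the point of the weight-shared recurrence.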